FINITE APPROXIMATIONS TO COHERENT CHOICE



MATTHIAS C. M. TROFFAES 



Abstract. This paper studies and bounds the effects of approximating loss functions and 
credal sets on choice functions, under very weak assumptions. In particular, the credal set is 
assumed to be neither convex nor closed. The main result is that the effects of approximation 
can be bounded, although in general, approximation of the credal set may not always be prac- 
tically possible. In case of pairwise choice, I demonstrate how the situation can be improved 
by showing that only approximations of the extreme points of the closure of the convex hull 
of the credal set need to be taken into account, as expected. 



1. Introduction 



Classical decision theory tells a decision maker to choose that option which maximises his 
expected utility. A generalisation of this principle is compelling when the probabilities and 
utilities relevant to the problem are not well known. Choice functions are one such generalisation, 
and select a set of optimal options: instead of pointing to a single solution based on possibly 
wrong assumptions, choice functions provide a set of optimal options. The decision maker can 
then investigate further if the set is too large, or not, if for instance the optimal set is a singleton,
or if a single option from the set stands out from the rest by other arguments.

However, in modelling decision problems, we often afford ourselves the luxury of infinite spaces 
and infinite sets, making those problems sometimes hard to solve analytically. In such cases we 
must resort to computers, and these cannot handle random variables on infinite spaces, let alone 
arbitrary infinite sets of probabilities. Hence, in that case we must approximate our infinite sets
by finite ones. By taking the finite sets sufficiently large, hopefully the approximation reflects
the true result accurately. This paper confirms this intuition when modelling choice functions
induced by arbitrary (not necessarily convex) sets of probabilities and a single cardinal utility,
extending similar results known in classical decision theory [5J |TT] .

The paper is organised as follows. Section 2 introduces notation, and briefly reviews the theory
of coherent choice functions and their role in decision theory. In Section 3 the building blocks
for a theory of approximation are introduced, along with some useful results on what they imply
for loss functions, sets of probabilities, and expected utility. The main part of the paper begins
in Section 4, studying and bounding the effects of approximation on coherent choice functions.
Section 5 improves the results of the previous section for pairwise choice. Section 6 concludes
the paper. Some essential but technical results on approximating the standard simplex in $\mathbb{R}^n$
are deferred to an appendix.

2. Choice Functions 

Let $\Omega$ denote an arbitrary set of states. Bounded random quantities on $\Omega$, i.e. bounded maps
from $\Omega$ to $\mathbb{R}$, are also called gambles [15], and will be denoted by $f$, $g$, ... $\mathcal{L}(\Omega)$ denotes the set
of all gambles on $\Omega$. Finitely additive probability measures, or briefly probability charges [2], are
denoted by $P$, $Q$, ... and $\mathcal{P}(\Omega)$ denotes the set of all probability charges on the power set $\wp(\Omega)$
of $\Omega$.

Key words and phrases. decision making, E-admissibility, maximality, numerical analysis, lower prevision,
sensitivity analysis.

In a decision problem, we desire to choose an optimal option $d$ from a set $D$ of options.
Choosing $d$ induces an uncertain reward $r$ from a set $R$ of rewards, with probability charge
$\mu_d(\cdot|w)$ over $\wp(R)$, depending on the outcome of the uncertain state $w \in \Omega$. For each $w \in \Omega$,
$\mu_d(\cdot|w)$ is a lottery over $R$, and as a function of $w$, $\mu_d(\cdot|\cdot)\colon w \mapsto \mu_d(\cdot|w)$ is a horse lottery or act.

If we model our belief about states and rewards by a probability charge $P$ on $\wp(\Omega)$ and a state
dependent utility function $U(\cdot|w)$ on $R$, then utility theory [T71[TJ|3] tells us to choose a decision
$d$ which maximises the expected utility, or prevision:

$$E(d) = \int_\Omega \left( \int_R U(r|w)\, d\mu_d(r|w) \right) dP(w) = \int_\Omega f_d(w)\, dP(w)$$

where $f_d(w) = \int_R U(r|w)\, d\mu_d(r|w)$ is the gamble associated with decision $d$, and the integrals
are Dunford integrals [2]. For simplicity, in this paper, we assume $U(r|w)$ to be bounded, i.e.

$$\sup U(r|w) - \inf U(r|w) < +\infty$$

Among other things, this ensures that relative approximation can be defined, as in Section 3,
without technical complications.

A decision which maximises expected utility is called a Bayes decision for the decision problem
$(\Omega, D, P, U)$.

However, if we are not sure about the probability of all events and the utility of all rewards,
a more reliable design is to use a family $(P_\alpha, U_\alpha)_{\alpha \in \aleph}$ of probability-utility pairs (where $\aleph$ is an
arbitrary index set), and to elicit from $D$ those options which maximise expected utility with
respect to at least one of the pairs $(P_\alpha, U_\alpha)$. First, for each $\alpha \in \aleph$, let

$$E_\alpha(d) = \int_\Omega f_d^\alpha(w)\, dP_\alpha(w)$$

where $f_d^\alpha(w) = \int_R U_\alpha(r|w)\, d\mu_d(r|w)$ is the gamble associated with decision $d$ and model $\alpha \in \aleph$.
Then we define:

Definition 1. A decision $d \in D$ is called an optimal decision for the decision problem $(\Omega, D, (P_\alpha, U_\alpha)_{\alpha \in \aleph})$
if $d$ belongs to the set

$$\operatorname{opt}(\Omega, D, (P_\alpha, U_\alpha)_{\alpha \in \aleph}) = \{d \in D : (\exists \alpha \in \aleph)(\forall e \in D)(E_\alpha(d) \ge E_\alpha(e))\}
= \left\{d \in D : (\exists \alpha \in \aleph)\; E_\alpha(d) = \sup_{e \in D} E_\alpha(e)\right\}$$

As such, the operator opt selects a set of optimal decisions, namely all decisions which are
Bayes with respect to $(\Omega, D, P_\alpha, U_\alpha)$ for at least one $\alpha \in \aleph$. Such an operator is called a choice
function or optimality operator [3j [16] .

In case $(P_\alpha, U_\alpha)_{\alpha \in \aleph} = \mathcal{M} \times \mathcal{U}$ for some convex sets $\mathcal{M}$ and $\mathcal{U}$, optimality as defined above is
also called E-admissibility [9, Sec. 4.8].

There are many ways to define a choice function starting from a set $(P_\alpha, U_\alpha)_{\alpha \in \aleph}$ (see [§1 IT4"1 [T51
8, 16 ). The one in Definition 1 satisfies an interesting set of axioms [SI US]) and is the subject of
a representation theorem in case utility is precise and state independent (i.e. if $U_\alpha(r|w)$ depends
neither on $\alpha$ nor on $w$) and $\Omega$ is finite (for infinite $\Omega$ the representation theorem is subject to
additional constraints, which preclude merely finitely additive probabilities over $\Omega$) [13].
For the sake of simplicity, we shall only be concerned with decision problems with precise
and state independent utility functions, i.e. when $(P_\alpha, U_\alpha)_{\alpha \in \aleph} = \mathcal{M} \times \{U\}$ with $U\colon R \to \mathbb{R}$ a
bounded state independent utility over $R$ and $\mathcal{M} \subseteq \mathcal{P}(\Omega)$.

The set $\mathcal{M}$ is called a credal set as it represents our belief about $w \in \Omega$. We can identify $\mathcal{M}$
itself as index set, and write

$$E_P(d) = \int_\Omega f_d(w)\, dP(w)$$

with $f_d(w) = \int_R U(r)\, d\mu_d(r|w)$, for any $P \in \mathcal{M}$.

Finally, defining the loss function $L\colon D \times \Omega \to \mathbb{R}$ as $L(d, w) = -f_d(w)$, the expected value
$E_P(d)$ is uniquely determined by $P$ and $L$ alone: we need not be concerned explicitly with $R$,
$\mu_d(r|w)$, and $U(r)$.
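For a finite state space and a finite credal set, Definition 1 can be evaluated directly. The following is only a minimal sketch under those assumptions; the function names `expectation` and `opt`, and the example gambles and probabilities, are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from typing import Dict, List, Sequence, Set


def expectation(gamble: Sequence[Fraction], prob: Sequence[Fraction]) -> Fraction:
    """Expected utility of a gamble under a probability mass function."""
    return sum((f * p for f, p in zip(gamble, prob)), Fraction(0))


def opt(gambles: Dict[str, Sequence[Fraction]],
        credal_set: List[Sequence[Fraction]]) -> Set[str]:
    """Decisions that maximise expected utility for at least one P in the credal set."""
    optimal: Set[str] = set()
    for prob in credal_set:
        values = {d: expectation(f, prob) for d, f in gambles.items()}
        best = max(values.values())
        optimal |= {d for d, v in values.items() if v == best}
    return optimal


# Hypothetical example: two states, three decisions, two probability mass functions.
gambles = {
    "a": [Fraction(1), Fraction(0)],
    "b": [Fraction(0), Fraction(1)],
    "c": [Fraction(2, 5), Fraction(2, 5)],
}
credal_set = [
    [Fraction(3, 4), Fraction(1, 4)],
    [Fraction(1, 4), Fraction(3, 4)],
]
optimal = opt(gambles, credal_set)
print(sorted(optimal))
```

In this example decision "c" is never Bayes for either element of the credal set, so it is not selected. Exact rational arithmetic is used only to avoid rounding issues in the equality test.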

3. Approximate Gambles, Probabilities, and Previsions 

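Let $\mathcal{A} = \{A_1, \dots, A_n\}$ denote a finite partition of $\Omega$. As we approximate $\Omega$ by the finite set
$\mathcal{A}$, we also need to approximate decisions, gambles, and probability charges on $\Omega$.

Let $\epsilon > 0$. For a gamble $f$ in $\mathcal{L}(\Omega)$ and a gamble $\hat f$ in $\mathcal{L}(\mathcal{A})$, we shall write $f \simeq_\epsilon \hat f$ if

$$\max_{A \in \mathcal{A}} \sup_{w \in A} \left| f(w) - \hat f(A) \right| \le [\sup f - \inf f]\,\epsilon$$

Note that $f \simeq_\epsilon \hat f$ implies $af + b \simeq_\epsilon a\hat f + b$, for any real numbers $a$ and $b$, $a > 0$. Therefore, the
relation $\simeq_\epsilon$ is invariant with respect to positive linear transformations of utility: it only depends
on our preferences over lotteries, and not on our particular choice of utility scale.

For a probability charge $P$ in $\mathcal{P}(\Omega)$, and a probability charge $\hat P$ in $\mathcal{P}(\mathcal{A})$, we shall write $P \simeq_\epsilon \hat P$
if

$$\sum_{A \in \mathcal{A}} \left| P(A) - \hat P(A) \right| \le \epsilon$$

Note that this implies $|P(A) - \hat P(A)| \le \epsilon$ for any $A \in \wp(\mathcal{A})$. Also note the differences between
the definitions of $\simeq_\epsilon$ for gambles and bounded charges.

For a loss function $L$ on $D \times \Omega$, and a loss function $\hat L$ on $D \times \mathcal{A}$, we write $L \simeq_\epsilon \hat L$ if for all
$d \in D$

$$f_d \simeq_\epsilon \hat f_d$$

(with $f_d(w) = -L(d, w)$ and $\hat f_d(A) = -\hat L(d, A)$).

For a subset $\mathcal{M}$ of $\mathcal{P}(\Omega)$ and a subset $\hat{\mathcal{M}}$ of $\mathcal{P}(\mathcal{A})$, we write $\mathcal{M} \simeq_\epsilon \hat{\mathcal{M}}$ if for every $P$ in $\mathcal{M}$
there is a $\hat P$ in $\hat{\mathcal{M}}$ such that $P \simeq_\epsilon \hat P$, and for every $\hat P$ in $\hat{\mathcal{M}}$ there is a $P$ in $\mathcal{M}$ such that $P \simeq_\epsilon \hat P$.

A few useful results about approximations are stated in the next lemmas.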

Lemma 2. Assume that $D$ is finite. Then, for every loss function $L$ on $D \times \Omega$ and every
$\epsilon > 0$, there is a finite partition $\mathcal{A}$ of $\Omega$ and a loss function $\hat L$ on $D \times \mathcal{A}$ such that $L \simeq_\epsilon \hat L$ and

$$|\mathcal{A}| \le (1 + 1/\epsilon)^{|D|}$$

Proof. Consider any $d$ in $D$, and let $R_d = \sup f_d - \inf f_d$. Because $f_d$ is bounded, we can embed
the range of $f_d$ in $k$ intervals $I_1, \dots, I_k$ of length $R_d\epsilon$, say

$$[\inf f_d, \inf f_d + R_d\epsilon),\ [\inf f_d + R_d\epsilon, \inf f_d + 2R_d\epsilon),\ \dots,\ [\inf f_d + (k-1)R_d\epsilon, \inf f_d + kR_d\epsilon)$$

with $k$ such that $\sup f_d \in I_k$. Therefore, $\inf f_d + (k-1)R_d\epsilon \le \sup f_d < \inf f_d + kR_d\epsilon$ and hence
$k - 1 \le 1/\epsilon < k$. Observe that $k$ is independent of $d \in D$.

The sets $A_1, \dots, A_k$ defined by

$$A_j = \{w \in \Omega : f_d(w) \in I_j\}$$

form a finite partition $\mathcal{A}_d = \{A_j : A_j \neq \emptyset\}$ of cardinality $|\mathcal{A}_d| \le k \le 1 + 1/\epsilon$ and the gamble
$\hat f_d \in \mathcal{L}(\mathcal{A}_d)$ defined by

$$\hat f_d(A_j) = \inf_{w \in A_j} f_d(w)$$

satisfies

$$\sup_{w \in A_j} \left| f_d(w) - \hat f_d(A_j) \right| = \sup_{w : f_d(w) \in I_j} f_d(w) - \inf_{w : f_d(w) \in I_j} f_d(w) \le \sup I_j - \inf I_j = R_d\epsilon$$

for all $A_j \in \mathcal{A}_d$; hence $f_d \simeq_\epsilon \hat f_d$. Defining $\hat L(d, A) = -\hat f_d(A)$ for all $d \in D$, we have $L \simeq_\epsilon \hat L$.

The finite collection of partitions $\{\mathcal{A}_d : d \in D\}$ has a smallest common refinement $\mathcal{A}$. Since
each $\mathcal{A}_d$ has no more than $1 + 1/\epsilon$ elements, $\mathcal{A}$ has no more than $(1 + 1/\epsilon)^{|D|}$ elements. Indeed,
two partitions of cardinalities $k_1$ and $k_2$ respectively have a smallest common refinement of
cardinality no more than $k_1 k_2$. By induction, $n$ partitions of cardinalities $k_1$, ..., $k_n$ have a
smallest common refinement of cardinality no more than $\prod_{i=1}^n k_i$ and hence,

$$|\mathcal{A}| \le (1 + 1/\epsilon)^{|D|}$$

□

Table 1 lists upper bounds on the size of the partition, to ensure $L \simeq_\epsilon \hat L$, for various values of
$\epsilon$ and $|D|$, according to Lemma 2.

              ε:    0.2     0.1    0.05    0.02    0.01
    |D|:   2        1.6     2.1     2.6     3.4     4.0
           4        3.1     4.2     5.3     6.8     8.0
           8        6.2     8.3    10.6    13.7    16.0
          16       12.5    16.7    21.2    27.3    32.1
          32       24.9    33.3    42.3    54.6    64.1

Table 1. Upper bound on $\log_{10}(|\mathcal{A}|)$, i.e. the logarithm of the cardinality of the
finite partition $\mathcal{A}$, for various values of precision $\epsilon > 0$ and number of decisions $|D|$
(see Lemma 2).
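The construction in the proof of Lemma 2 can be sketched for a finite (but possibly large) state space: each state is labelled by the interval index of $f_d(w)$ for every decision $d$, and states with identical labels form one cell of the common refinement. This is an illustrative sketch only; the function name `approximate_gambles` and the example gambles are assumptions, and exact fractions are used so that interval boundaries are handled without rounding.

```python
from collections import defaultdict
from fractions import Fraction


def approximate_gambles(gambles, eps):
    """Return (partition, approx) for gambles given as lists indexed by state.

    partition is a list of cells (lists of states); approx[d][i] is the value
    of the approximating gamble for decision d on cell partition[i].
    """
    n_states = len(next(iter(gambles.values())))
    lows = {d: min(f) for d, f in gambles.items()}
    widths = {d: (max(f) - min(f)) * eps for d, f in gambles.items()}
    cells = defaultdict(list)
    for w in range(n_states):
        # Interval index of f_d(w) for every decision d; a constant gamble
        # (width 0) needs a single interval only.
        key = tuple(
            (f[w] - lows[d]) // widths[d] if widths[d] else 0
            for d, f in gambles.items()
        )
        cells[key].append(w)
    partition = list(cells.values())
    # As in the proof, the approximating gamble takes the infimum on each cell.
    approx = {d: [min(f[w] for w in cell) for cell in partition]
              for d, f in gambles.items()}
    return partition, approx


# Hypothetical example: 100 states, two decisions, relative precision 1/4.
eps = Fraction(1, 4)
gambles = {
    "a": [Fraction(w, 99) for w in range(100)],
    "b": [Fraction((99 - w) ** 2, 99 ** 2) for w in range(100)],
}
partition, approx = approximate_gambles(gambles, eps)
print(len(partition), "cells; bound:", (1 + 1 / eps) ** len(gambles))
```

The number of cells actually produced is usually far below the bound $(1 + 1/\epsilon)^{|D|}$, which is attained only when every combination of interval indices occurs.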



Let $\binom{a}{b}$ be the binomial coefficient, defined for all real numbers $a \ge b \ge 0$ by

$$\binom{a}{b} = \frac{\Gamma(a + 1)}{\Gamma(b + 1)\,\Gamma(a - b + 1)}$$

with $\Gamma$ the Gamma function.



Lemma 3. For every subset $\mathcal{M}$ of $\mathcal{P}(\Omega)$, every $\delta > 0$, and every finite partition $\mathcal{A}$ of $\Omega$, there
is a finite subset $\hat{\mathcal{M}}$ of $\mathcal{P}(\mathcal{A})$ such that $\mathcal{M} \simeq_\delta \hat{\mathcal{M}}$ and

$$|\hat{\mathcal{M}}| \le \binom{|\mathcal{A}|(1 + 1/\delta)}{|\mathcal{A}| - 1}$$

Proof. Consider any $P$ in $\mathcal{M}$. Let $n = |\mathcal{A}|$ and let the elements of $\mathcal{A}$ be $A_1, \dots, A_n$. Consider
the vector $x = (P(A_1), \dots, P(A_n))$ in $\Delta^n$. Let $N$ be the smallest natural number such that
$N \ge n/\delta$.

By Lemma 13 in the appendix, there is a vector $y$ in $\Delta^n_N$ such that

$$|x - y|_1 < n/N \le \delta$$

Define $\hat P$ in $\mathcal{P}(\mathcal{A})$ by

$$\hat P(A_i) = y_i$$

for all $i \in \{1, \dots, n\}$; by finite additivity, $\hat P$ is well defined on $\wp(\mathcal{A})$. By construction, $P \simeq_\delta \hat P$
because

$$\sum_{i=1}^n \left| P(A_i) - \hat P(A_i) \right| = |x - y|_1 < \delta$$

Approximating each $P$ in $\mathcal{M}$ in this manner, the set

$$\hat{\mathcal{M}} = \{\hat P : P \in \mathcal{M}\}$$

is finite as each of its elements corresponds to an element of the finite set $\Delta^n_N$, and therefore
$|\hat{\mathcal{M}}| \le |\Delta^n_N|$. By Lemma 12 in the appendix,

$$|\hat{\mathcal{M}}| \le \binom{N + n - 1}{N} = \binom{N + n - 1}{n - 1} \le \binom{n/\delta + 1 + n - 1}{n - 1} = \binom{|\mathcal{A}|(1 + 1/\delta)}{|\mathcal{A}| - 1}$$

The second inequality follows from $N < n/\delta + 1$ and the fact that $\binom{a}{b}$ is strictly increasing in $a$, for fixed $b$ (for
integer $a$ and $b$ this follows immediately from Pascal's triangle; the general case follows from the
properties of the Gamma function). □

Table 2 lists upper bounds on the cardinality of $\hat{\mathcal{M}}$ on a logarithmic scale, for some values
of $|\mathcal{A}|$ and $\delta$. The cardinality grows enormously fast with increasing $|\mathcal{A}|$ and $1/\delta$. Within the
range of Table 2 an exponential trend is obvious. The table shows that the influence of $|\mathcal{A}|$ is
much larger than the influence of $\delta$: more precisely, doubling $|\mathcal{A}|$ increases $|\hat{\mathcal{M}}|$ by far more than
halving $\delta$.

                       δ:       0.2         0.1        0.05
    |A|:            4           3.3         4.1         5.0
                    8           7.9         9.8        11.8
                   12          12.5        15.5        18.7
                   16          17.1        21.3        25.6
                   20          21.8        27.1        32.6
                   24          26.4        32.9        39.5
                   28          31.1        38.6        46.5
                   32          35.8        44.4        53.4
    log10(|A|):   0.7           4.4         5.5         6.7
                  1.4          27.6        34.3        41.3
                  2.1         144.6       179.5       215.5
                  2.8         731.3       906.8      1088.2
                  3.5        3666.1      4544.7      5452.8
                  4.2       18341.5     22735.9     27277.5
                  4.9       91719.7    113693.0    136402.5

Table 2. Upper bound on $\log_{10}(|\hat{\mathcal{M}}|)$, i.e. the logarithm of the cardinality of
the finite set of probability charges $\hat{\mathcal{M}}$, for various values of precision $\delta > 0$ and
cardinality of the partition $|\mathcal{A}|$ (see Lemma 3).

Next, we study the effect on the expectation if both gambles and probabilities are
approximated. Let us use the notation $E_P(f) = \int_\Omega f(w)\, dP(w)$. In the lemma below, assume
$0 < \epsilon < 1/2$.
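The bound of Lemma 3 is easy to evaluate numerically through the log-Gamma function, which avoids overflow for the large arguments in Table 2. The sketch below is illustrative; the function names `log10_binomial` and `log10_credal_bound` are hypothetical.

```python
from math import lgamma, log


def log10_binomial(a: float, b: float) -> float:
    """log10 of the generalised binomial coefficient Gamma(a+1) / (Gamma(b+1) Gamma(a-b+1))."""
    return (lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)) / log(10)


def log10_credal_bound(n_cells: int, delta: float) -> float:
    """log10 of the Lemma 3 upper bound for a partition with n_cells elements."""
    return log10_binomial(n_cells * (1 + 1 / delta), n_cells - 1)


# A few entries, to be compared with the first rows of Table 2.
for n_cells in (4, 8):
    print(n_cells, [round(log10_credal_bound(n_cells, d), 1) for d in (0.2, 0.1, 0.05)])
```

For instance, with $|\mathcal{A}| = 4$ and $\delta = 0.2$ the bound is $\binom{24}{3} = 2024$, whose base-10 logarithm rounds to 3.3.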
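Lemma 4. For every finite partition $\mathcal{A}$ of $\Omega$, every $f \in \mathcal{L}(\Omega)$, $\hat f \in \mathcal{L}(\mathcal{A})$, $P \in \mathcal{P}(\Omega)$, and
$\hat P \in \mathcal{P}(\mathcal{A})$, the following implications hold. If $f \simeq_\epsilon \hat f$ and $P \simeq_\delta \hat P$ then

$$\left| E_P(f) - E_{\hat P}(\hat f) \right| \le [\sup f - \inf f]\,(\epsilon + \delta(1 + 2\epsilon))$$

and

$$\left| E_P(f) - E_{\hat P}(\hat f) \right| \le [\sup \hat f - \inf \hat f]\left(\frac{\epsilon}{1 - 2\epsilon} + \delta\right)$$

Proof. Let $R = \sup f - \inf f$, $\hat R = \sup \hat f - \inf \hat f$, and write $\inf_A f$ for $\inf_{w \in A} f(w)$ and $\sup_A f$ for
$\sup_{w \in A} f(w)$. Then

$$\left| E_P(f) - E_{\hat P}(\hat f) \right| = \left| \sum_{A \in \mathcal{A}} \left( \int_A f\, dP - \hat f(A)\hat P(A) \right) \right|$$

and since $P(A) \inf_A f \le \int_A f\, dP \le P(A) \sup_A f$, there is an $r_A \in [\inf_A f, \sup_A f]$ such that
$P(A) r_A = \int_A f\, dP$, and hence

$$= \left| \sum_{A \in \mathcal{A}} \left( r_A P(A) - \hat f(A)\hat P(A) \right) \right|$$

but, because $|f(w) - \hat f(A)| \le R\epsilon$ for all $w \in A$, and $\inf_A f \le r_A \le \sup_A f$, it must also
hold that $|r_A - \hat f(A)| \le R\epsilon$, so $\left| \sum_{A \in \mathcal{A}} (r_A P(A) - \hat f(A) P(A)) \right| \le \sum_{A \in \mathcal{A}} |r_A - \hat f(A)|\, P(A) \le
\sum_{A \in \mathcal{A}} R\epsilon P(A) = R\epsilon$, whence

$$\begin{aligned}
&\le \left| \sum_{A \in \mathcal{A}} \left( \hat f(A) P(A) - \hat f(A)\hat P(A) \right) \right| + R\epsilon \\
&= \left| \sum_{A \in \mathcal{A}} \hat f(A) \left( P(A) - \hat P(A) \right) \right| + R\epsilon
\end{aligned}$$

and because $\sum_{A \in \mathcal{A}} (P(A) - \hat P(A)) = 0$,

$$\begin{aligned}
&= \left| \sum_{A \in \mathcal{A}} (\hat f(A) - \inf \hat f)\left( P(A) - \hat P(A) \right) \right| + R\epsilon \\
&\le \sum_{A \in \mathcal{A}} (\hat f(A) - \inf \hat f)\left| P(A) - \hat P(A) \right| + R\epsilon \\
&\le (\sup \hat f - \inf \hat f) \sum_{A \in \mathcal{A}} \left| P(A) - \hat P(A) \right| + R\epsilon \\
&\le \hat R\delta + R\epsilon
\end{aligned}$$

and since $R(1 + 2\epsilon) \ge \hat R \ge R(1 - 2\epsilon)$,

$$\hat R\delta + R\epsilon \le R(1 + 2\epsilon)\delta + R\epsilon = R(\epsilon + \delta(1 + 2\epsilon))$$

and

$$\hat R\delta + R\epsilon \le \hat R\delta + \hat R\epsilon/(1 - 2\epsilon) = \hat R\left(\epsilon/(1 - 2\epsilon) + \delta\right)$$

□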
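As a sanity check, both bounds of Lemma 4 can be verified on a small example with exact arithmetic. The numbers below are hypothetical and chosen only for illustration: four states, a partition into two cells, and the smallest $\epsilon$ and $\delta$ for which the approximation relations hold.

```python
from fractions import Fraction as F

partition = [[0, 1], [2, 3]]              # cells A_1, A_2 of Omega = {0, 1, 2, 3}
f = [F(0), F(1, 10), F(9, 10), F(1)]      # gamble on Omega
f_hat = [F(0), F(9, 10)]                  # approximating gamble on the partition
P = [F(1, 4)] * 4                         # probability mass function on Omega
P_hat = [F(3, 5), F(2, 5)]                # approximating mass function on the partition

R = max(f) - min(f)
R_hat = max(f_hat) - min(f_hat)

# Smallest eps and delta for which f and P are approximated by f_hat and P_hat.
eps = max(abs(f[w] - f_hat[i]) for i, cell in enumerate(partition) for w in cell) / R
delta = sum(abs(sum(P[w] for w in cell) - P_hat[i]) for i, cell in enumerate(partition))

gap = abs(sum(x * p for x, p in zip(f, P)) - sum(x * p for x, p in zip(f_hat, P_hat)))
bound_1 = R * (eps + delta * (1 + 2 * eps))
bound_2 = R_hat * (eps / (1 - 2 * eps) + delta)
print(gap, bound_1, bound_2)
```

Here the actual difference in expectation is well within both bounds, which is typical: the lemma describes the worst case over all gambles and charges satisfying the approximation relations.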
[Figure 1: plot of the upper bound on $\log_{10}|\hat{\mathcal{M}}|$ (vertical axis) against $\epsilon$ (horizontal axis).]

Figure 1. Upper bound on $\log_{10}|\hat{\mathcal{M}}|$ for various values of $\epsilon$, with $\epsilon + \delta = 0.2$
and $|D| = 2$.



Let us now investigate what is the best choice for $\epsilon > 0$ and $\delta > 0$. The cardinality
of $\hat{\mathcal{M}}$ is of largest concern as it grows enormously fast with increasing cardinality of the finite
partition $\mathcal{A}$ and with increasing precision $1/\delta$ (see Table 2). Therefore, as a first step, let us
see how we can minimise $|\hat{\mathcal{M}}|$, assuming a fixed relative error $\epsilon + \delta$ on the expectation (see
Lemma 4), omitting higher order terms in $\epsilon$ and $\delta$ to simplify the analysis.

We wish to minimise the upper bound (neglecting lower order terms)

$$\binom{1/(\epsilon^{|D|}\delta)}{1/\epsilon^{|D|}}$$

on $|\hat{\mathcal{M}}|$ along the $\epsilon$-$\delta$-curve $\gamma(\epsilon, \delta) = \epsilon + \delta = \gamma^*$. Figure 1 demonstrates a typical case: the $\epsilon$-$\delta$-ratio
has a large impact on the upper bound of $|\hat{\mathcal{M}}|$. In particular, the curve grows extremely
large for small $\epsilon$, because a small $\epsilon$ corresponds to a large partition $\mathcal{A}$, and the cardinality of the
partition has a huge impact on the cardinality of $\hat{\mathcal{M}}$, as shown in Table 2.

4. Approximate Choice 

Let us now consider again the decision problem $(\Omega, D, \mathcal{M}, L)$ with state space $\Omega$, decision
space $D$, credal set $\mathcal{M}$, and loss function $L$, and reflect upon how the results in the previous
section could be of use in finding the optimal decisions $\operatorname{opt}(\Omega, D, \mathcal{M}, L)$. Can we still find the
optimal decisions after approximating the loss function $L$ and the set of probabilities $\mathcal{M}$?

As we admit a relative error on gambles and probabilities, and therefore also on previsions,
we should admit a relative error on the choice function as well. Let $R_D$ be defined by (recall
that $f_d(w) = -L(d, w)$)

$$R_D = \sup_{d \in D}\, [\sup f_d - \inf f_d]$$

Definition 5. Let $\epsilon \ge 0$. A decision $d$ in $D$ is called an $\epsilon$-optimal decision for the decision
problem $(\Omega, D, \mathcal{M}, L)$ if it belongs to the set

$$\operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, L) = \left\{ d \in D : (\exists P \in \mathcal{M}) \left( \sup_{e \in D} E_P(e) - E_P(d) \le \epsilon R_D \right) \right\}$$

Note that

$$\operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, aL + b) = \operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, L)$$

for any real numbers $a$ and $b$, $a > 0$. In other words, $\operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, L)$ is invariant with respect
to positive linear transformations of utility: $\epsilon$-optimality does not depend on our choice of utility
scale.

Clearly,

$$\operatorname{opt}(\Omega, D, \mathcal{M}, L) \subseteq \operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, L)$$

because

$$\operatorname{opt}^\epsilon(\Omega, D, \mathcal{M}, L) \subseteq \operatorname{opt}^\delta(\Omega, D, \mathcal{M}, L)$$

whenever $\epsilon \le \delta$, and

$$\operatorname{opt}^0(\Omega, D, \mathcal{M}, L) = \operatorname{opt}(\Omega, D, \mathcal{M}, L)$$

In approximating a decision problem $(\Omega, D, \mathcal{M}, L)$, we start with a finite partition $\mathcal{A}$, consider
a (possibly finite) set $\hat{\mathcal{M}}$ such that $\mathcal{M} \simeq_\delta \hat{\mathcal{M}}$, and approximate the loss $L(d, w)$ by a loss $\hat L(d, A)$
such that $L \simeq_\epsilon \hat L$.

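Theorem 6. Consider two decision problems $(\Omega, D, \mathcal{M}, L)$ and $(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$. If $L \simeq_\epsilon \hat L$ and
$\mathcal{M} \simeq_\delta \hat{\mathcal{M}}$ then, for any $\gamma \ge 0$,

$$\operatorname{opt}^\gamma(\Omega, D, \mathcal{M}, L) \subseteq \operatorname{opt}^{\frac{\gamma}{1 - 2\epsilon} + 2\left(\frac{\epsilon}{1 - 2\epsilon} + \delta\right)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \tag{1}$$

$$\operatorname{opt}^\gamma(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \subseteq \operatorname{opt}^{\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \mathcal{M}, L) \tag{2}$$

Proof. Throughout, $\hat R_D = \sup_{d \in D} [\sup \hat f_d - \inf \hat f_d]$ denotes the counterpart of $R_D$ for $\hat L$.
We prove Eq. (1). Let $d \in \operatorname{opt}^\gamma(\Omega, D, \mathcal{M}, L)$. Then

$$\sup_{e \in D} E_P(f_e) - E_P(f_d) \le \gamma R_D \tag{3}$$

for some $P \in \mathcal{M}$. Let $\hat P \in \hat{\mathcal{M}}$ be such that $P \simeq_\delta \hat P$. Because, by Lemma 4,

$$\begin{aligned}
\left| \sup_{e \in D} E_{\hat P}(\hat f_e) - \sup_{e' \in D} E_P(f_{e'}) \right|
&\le \sup_{e \in D} \left| E_{\hat P}(\hat f_e) - E_P(f_e) \right| \\
&\le \sup_{e \in D}\, [\sup \hat f_e - \inf \hat f_e]\,(\epsilon/(1 - 2\epsilon) + \delta) \\
&= (\epsilon/(1 - 2\epsilon) + \delta)\hat R_D
\end{aligned} \tag{4}$$

it follows that

$$\sup_{e \in D} E_{\hat P}(\hat f_e) - E_{\hat P}(\hat f_d) \le \sup_{e \in D} E_P(f_e) - E_{\hat P}(\hat f_d) + (\epsilon/(1 - 2\epsilon) + \delta)\hat R_D$$

and again by Lemma 4

$$\le \sup_{e \in D} E_P(f_e) - E_P(f_d) + 2(\epsilon/(1 - 2\epsilon) + \delta)\hat R_D$$

and by Eq. (3),

$$\begin{aligned}
&\le \gamma R_D + 2(\epsilon/(1 - 2\epsilon) + \delta)\hat R_D \\
&\le [\gamma/(1 - 2\epsilon) + 2(\epsilon/(1 - 2\epsilon) + \delta)]\hat R_D
\end{aligned}$$

hence, $d \in \operatorname{opt}^{\frac{\gamma}{1 - 2\epsilon} + 2\left(\frac{\epsilon}{1 - 2\epsilon} + \delta\right)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$.

Next, we prove Eq. (2). Let $d \in \operatorname{opt}^\gamma(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$. Then

$$\sup_{e \in D} E_{\hat P}(\hat f_e) - E_{\hat P}(\hat f_d) \le \gamma \hat R_D \tag{5}$$

for some $\hat P \in \hat{\mathcal{M}}$. Let $P \in \mathcal{M}$ be such that $P \simeq_\delta \hat P$. Because, by Lemma 4,

$$\begin{aligned}
\left| \sup_{e \in D} E_P(f_e) - \sup_{e' \in D} E_{\hat P}(\hat f_{e'}) \right|
&\le \sup_{e \in D} \left| E_P(f_e) - E_{\hat P}(\hat f_e) \right| \\
&\le \sup_{e \in D}\, [\sup f_e - \inf f_e]\,(\epsilon + \delta(1 + 2\epsilon)) \\
&= (\epsilon + \delta(1 + 2\epsilon)) R_D
\end{aligned} \tag{6}$$

we have that

$$\sup_{e \in D} E_P(f_e) - E_P(f_d) \le \sup_{e \in D} E_{\hat P}(\hat f_e) - E_P(f_d) + (\epsilon + \delta(1 + 2\epsilon)) R_D$$

and again by Lemma 4

$$\le \sup_{e \in D} E_{\hat P}(\hat f_e) - E_{\hat P}(\hat f_d) + 2(\epsilon + \delta(1 + 2\epsilon)) R_D$$

and by Eq. (5),

$$\begin{aligned}
&\le \gamma \hat R_D + 2(\epsilon + \delta(1 + 2\epsilon)) R_D \\
&\le [\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))] R_D
\end{aligned}$$

so $d \in \operatorname{opt}^{\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \mathcal{M}, L)$. □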

If we ignore higher order terms in $\gamma$, $\epsilon$, and $\delta$, then the above theorem says that when moving
from an original decision problem to an approximate decision problem, or the other way around,
with relative error $\epsilon$ in gambles and relative error $\delta$ in probabilities, the relative error in optimality
increases by $2(\epsilon + \delta)$. For example, for small $\epsilon$ and $\delta$ the following holds, up to a small error: if
$L \simeq_\epsilon \hat L$ and $\mathcal{M} \simeq_\delta \hat{\mathcal{M}}$, then

$$\operatorname{opt}(\Omega, D, \mathcal{M}, L) \subseteq \operatorname{opt}^{2(\epsilon + \delta)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \subseteq \operatorname{opt}^{4(\epsilon + \delta)}(\Omega, D, \mathcal{M}, L)$$

So, the approximate problem with relative error $2(\epsilon + \delta)$ will contain all solutions to the original
problem with no relative error, and will, so to speak, not contain any solutions to the original
problem with relative error over $4(\epsilon + \delta)$. Because of this property, $\operatorname{opt}^{2(\epsilon + \delta)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$ seems
a logical choice when solving decision problems in practice.
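Once a finite partition and a finite credal set are available, the set of $\epsilon$-optimal decisions of Definition 5 can be computed by direct enumeration. The sketch below assumes exactly that finite setting; the name `opt_eps` and the example data are hypothetical.

```python
from fractions import Fraction


def opt_eps(gambles, credal_set, eps):
    """Decisions whose expected utility is within eps * R_D of the best, for some P."""
    # R_D: largest range over all gambles (taken to be the gambles on the finite partition).
    r_d = max(max(f) - min(f) for f in gambles.values())
    selected = set()
    for prob in credal_set:
        values = {d: sum(x * p for x, p in zip(f, prob)) for d, f in gambles.items()}
        best = max(values.values())
        selected |= {d for d, v in values.items() if best - v <= eps * r_d}
    return selected


# Hypothetical example: two cells, three decisions, two probability mass functions.
gambles = {
    "a": [Fraction(1), Fraction(0)],
    "b": [Fraction(0), Fraction(1)],
    "c": [Fraction(2, 5), Fraction(2, 5)],
}
credal_set = [
    [Fraction(3, 4), Fraction(1, 4)],
    [Fraction(1, 4), Fraction(3, 4)],
]
exact = opt_eps(gambles, credal_set, Fraction(0))
relaxed = opt_eps(gambles, credal_set, Fraction(7, 20))
print(sorted(exact), sorted(relaxed))
```

In this example decision "c" falls short of the best expected utility by exactly $7/20$ of $R_D$ under both mass functions, so it enters the solution set precisely when the tolerance reaches $7/20$. This illustrates the nesting of the sets $\operatorname{opt}^\epsilon$ in $\epsilon$.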

5. Pairwise Choice 

Table 2 reveals that the size of the credal set is a serious computational bottleneck. Therefore,
it is worth investigating how the size of $\hat{\mathcal{M}}$ can be reduced, without compromising the accuracy
$\delta > 0$. One way to this end is to restrict to pairwise comparisons, i.e. using maximality (see
Walley QE Sec. 3.7-3.9]).

5.1. Maximality. 

Definition 7. A decision $d \in D$ is called a maximal decision for the decision problem $(\Omega, D, \mathcal{M}, L)$
if $d$ belongs to the set

$$\max(\Omega, D, \mathcal{M}, L) = \{d \in D : (\forall e \in D)(\exists P \in \mathcal{M})(E_P(d) \ge E_P(e))\}$$

Denote by $\operatorname{co}(\mathcal{M})$ the convex hull of $\mathcal{M}$. Obviously it holds that

$$\max(\Omega, D, \mathcal{M}, L) = \max(\Omega, D, \operatorname{co}(\mathcal{M}), L)$$

because for any $\lambda \in [0, 1]$ and any two $P$ and $Q$ in $\mathcal{M}$, the inequalities $E_P(d) > E_P(e)$ and
$E_Q(d) > E_Q(e)$ imply the inequality

$$E_{\lambda P + (1 - \lambda)Q}(d) > E_{\lambda P + (1 - \lambda)Q}(e)$$
This does not hold for optimality as defined in Definition 1: assuming $\Omega$ finite, for any two
distinct subsets $\mathcal{M}$ and $\mathcal{M}'$ of $\mathcal{P}(\Omega)$, we can always find a set $D$ and a loss function $L$ such that
$\operatorname{opt}(\Omega, D, \mathcal{M}, L) \neq \operatorname{opt}(\Omega, D, \mathcal{M}', L)$ (see Kadane, Schervish, and Seidenfeld [8, Thm. 1, p. 53]).

To understand why the above notion of optimality is called maximality, consider the strict
partial ordering $\succ$ on $D$ defined by

$$e \succ d \iff (\forall P \in \mathcal{M})(E_P(e) > E_P(d))$$

for any $d$ and $e$ in $D$, that is, $e$ is strictly preferred to $d$ if $e$ is strictly preferred to $d$ with respect
to every $P \in \mathcal{M}$. Then,

$$\max(\Omega, D, \mathcal{M}, L) = \{d \in D : (\forall e \in D)(e \not\succ d)\}$$

so $\max(\Omega, D, \mathcal{M}, L)$ selects those decisions $d$ which are undominated with respect to $\succ$. Therefore,
maximality can be expressed through pairwise preferences only, again in contrast to
$\operatorname{opt}(\Omega, D, \mathcal{M}, L)$ as for instance demonstrated by Kadane, Schervish, and Seidenfeld [8, Sec. 4,
p. 51].

However, because

$$\operatorname{opt}(\Omega, D, \mathcal{M}, L) \subseteq \max(\Omega, D, \mathcal{M}, L)$$

we may interpret $\max(\Omega, D, \mathcal{M}, L)$ as an approximation to $\operatorname{opt}(\Omega, D, \mathcal{M}, L)$, an approximation
which discards all preferences but the pairwise ones.

Let us admit a relative error on the choice function max as well. Recall, $R_D = \sup_{d \in D}\, [\sup f_d - \inf f_d]$.

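Definition 8. Let $\epsilon \ge 0$. A decision $d$ in $D$ is called an $\epsilon$-maximal decision for the decision
problem $(\Omega, D, \mathcal{M}, L)$ if it belongs to the set

$$\max{}^\epsilon(\Omega, D, \mathcal{M}, L) = \{d \in D : (\forall e \in D)(\exists P \in \mathcal{M})(E_P(e) - E_P(d) \le \epsilon R_D)\}$$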
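For finite problems, $\epsilon$-maximality only requires pairwise comparisons, which the following sketch makes explicit. It is an illustration under the assumption of a finite state space and a finite credal set; the name `max_eps` and the example data are hypothetical.

```python
from fractions import Fraction


def max_eps(gambles, credal_set, eps):
    """Decisions d such that every e beats d by at most eps * R_D under some P."""
    r_d = max(max(f) - min(f) for f in gambles.values())

    def ev(d, prob):
        return sum(x * p for x, p in zip(gambles[d], prob))

    return {
        d for d in gambles
        if all(any(ev(e, p) - ev(d, p) <= eps * r_d for p in credal_set)
               for e in gambles)
    }


# Hypothetical example: "c" is never Bayes but is undominated; "d" is dominated by "c".
gambles = {
    "a": [Fraction(1), Fraction(0)],
    "b": [Fraction(0), Fraction(1)],
    "c": [Fraction(2, 5), Fraction(2, 5)],
    "d": [Fraction(3, 10), Fraction(3, 10)],
}
credal_set = [
    [Fraction(3, 4), Fraction(1, 4)],
    [Fraction(1, 4), Fraction(3, 4)],
]
maximal = max_eps(gambles, credal_set, Fraction(0))
relaxed = max_eps(gambles, credal_set, Fraction(1, 10))
print(sorted(maximal), sorted(relaxed))
```

Decision "c" is maximal although it does not maximise expected utility under either mass function, which illustrates that the inclusion of the optimal set in the maximal set can be strict. Decision "d" is dominated by "c" with a margin of $1/10$ of $R_D$, so it becomes $\epsilon$-maximal once $\epsilon$ reaches $1/10$.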

5.2. Approximating Extreme Points. It turns out that we can restrict our attention to the
extreme points of the closure of the convex hull of $\mathcal{M}$, with respect to the topology of pointwise
convergence on members of $\mathcal{L}(\Omega)$. This topology is characterised by the following notion of
convergence: for every directed set $(\mathbb{A}, \le)$ and every net $(P_\alpha)_{\alpha \in \mathbb{A}}$, we have that $\lim_\alpha P_\alpha = P$ if

$$\lim_\alpha E_{P_\alpha}(f) = E_P(f) \text{ for all } f \in \mathcal{L}(\Omega)$$

Without further mention, I will assume this topology on $\mathcal{P}(\Omega)$. See for instance [12] for more
information regarding nets [12, Chapter 7] and this topology [12, §28.15].

There is a nice connection between the closure of $\mathcal{M}$, denoted by $\operatorname{cl}(\mathcal{M})$, and $\epsilon$-optimality and
$\epsilon$-maximality.

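Lemma 9. Assume that $R_D > 0$. Let $\epsilon \ge 0$. For any decision problem $(\Omega, D, \mathcal{M}, L)$, the
following equality holds:

$$\max{}^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L) = \bigcap_{\delta > 0} \max{}^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L) \tag{7}$$

and if additionally $D$ is finite, then the following equality holds as well:

$$\operatorname{opt}^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L) = \bigcap_{\delta > 0} \operatorname{opt}^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L) \tag{8}$$

Proof. We start with proving Eq. (7).

Assume $d \in \max^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L)$. Consider any $e \in D$. By assumption, there is a $P \in \operatorname{cl}(\mathcal{M})$
such that $E_P(e) - E_P(d) \le R_D\epsilon$. Because $P \in \operatorname{cl}(\mathcal{M})$, there is a net $(P_\alpha \in \mathcal{M})_{\alpha \in \mathbb{A}}$ such that
$\lim_\alpha E_{P_\alpha}(f) = E_P(f)$ for all gambles $f$. It follows that $\lim_\alpha E_{P_\alpha}(e) - \lim_\alpha E_{P_\alpha}(d) \le R_D\epsilon$.
This implies that for every $\delta > 0$, there is an $\alpha \in \mathbb{A}$ such that $E_{P_\alpha}(e) - E_{P_\alpha}(d) < (\epsilon + \delta)R_D$.
So, for every $\delta > 0$, there is a $P \in \mathcal{M}$ such that $E_P(e) - E_P(d) < (\epsilon + \delta)R_D$. Whence,
because this holds for any $e \in D$, $d \in \max^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$ for all $\delta > 0$, and therefore,
$d \in \bigcap_{\delta > 0} \max^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$.

Conversely, assume $d \in \bigcap_{\delta > 0} \max^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$. Consider any $e \in D$. Then, for all $\delta > 0$,
there is a $P_\delta \in \mathcal{M}$ such that $E_{P_\delta}(e) - E_{P_\delta}(d) \le (\epsilon + \delta)R_D$. Hence, for all $n \in \mathbb{N}$, there is a
$P_n \in \mathcal{M}$ such that

$$E_{P_n}(e) - E_{P_n}(d) \le 1/n + \epsilon R_D \tag{9}$$

For any $m \in \mathbb{N}$, consider the following closed subset of $\mathcal{P}(\Omega)$:

$$\mathcal{R}_m = \operatorname{cl}(\{P_n : n \ge m\})$$

The collection $\{\mathcal{R}_m : m \in \mathbb{N}\}$ satisfies the finite intersection property. By the Banach-Alaoglu-Bourbaki
theorem [12, §28.29(UF26)] $\mathcal{P}(\Omega)$ is compact, and hence

$$\mathcal{R} = \bigcap_{m \in \mathbb{N}} \mathcal{R}_m$$

is non-empty as well [12, §17.2].

Take any $Q \in \mathcal{R}$. Since each $P_n \in \mathcal{M}$, it follows that each $\mathcal{R}_m \subseteq \operatorname{cl}(\mathcal{M})$, and hence $Q \in \operatorname{cl}(\mathcal{M})$.
If we can show that $E_Q(e) - E_Q(d) \le \epsilon R_D$, then $d \in \max^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L)$ is established.

Indeed, fix $m \in \mathbb{N}$. Because $Q \in \mathcal{R}_m$, there is a net $(P_{n_\alpha})_{\alpha \in \mathbb{A}}$ in $\{P_n : n \ge m\}$ (so $n_\alpha \ge m$,
but $n_\alpha$ is not necessarily an increasing function of $\alpha$) such that $\lim_\alpha E_{P_{n_\alpha}}(f_e - f_d) = E_Q(f_e - f_d)$.
Hence, for each $\gamma > 0$, there is an $\alpha \in \mathbb{A}$ such that $E_Q(e) - E_Q(d) \le E_{P_{n_\alpha}}(e) - E_{P_{n_\alpha}}(d) + \gamma$,
and therefore by Eq. (9), $E_Q(e) - E_Q(d) \le 1/n_\alpha + \epsilon R_D + \gamma$. Because this inequality holds for
every $m$ and every $\gamma > 0$, and $n_\alpha \ge m$, it follows that $E_Q(e) - E_Q(d) \le \epsilon R_D$.

Let us now prove Eq. (8), under the additional assumption that $D$ is finite. The proof goes
along similar lines as the one for Eq. (7).

Assume $d \in \operatorname{opt}^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L)$. By assumption, there is a $P \in \operatorname{cl}(\mathcal{M})$ such that
$E_P(e) - E_P(d) \le R_D\epsilon$ for every $e \in D$. Because $P \in \operatorname{cl}(\mathcal{M})$, there is a net $(P_\alpha \in \mathcal{M})_{\alpha \in \mathbb{A}}$ such that
$\lim_\alpha E_{P_\alpha}(f) = E_P(f)$ for all gambles $f$. In particular, there is a net $(P_\alpha \in \mathcal{M})_{\alpha \in \mathbb{A}}$ such that
$\lim_\alpha E_{P_\alpha}(e) - \lim_\alpha E_{P_\alpha}(d) \le R_D\epsilon$ for every $e \in D$. So, for every $e \in D$ and $\delta > 0$, there is an
$\alpha_{e,\delta} \in \mathbb{A}$ such that $E_{P_\alpha}(e) - E_{P_\alpha}(d) < (\epsilon + \delta)R_D$ for all $\alpha \ge \alpha_{e,\delta}$. Because $D$ is finite, there is
an $\alpha_\delta$ such that $\alpha_\delta \ge \alpha_{e,\delta}$ for all $e \in D$. Hence, for every $\delta > 0$, there is an $\alpha_\delta \in \mathbb{A}$ such that
$E_{P_{\alpha_\delta}}(e) - E_{P_{\alpha_\delta}}(d) < (\epsilon + \delta)R_D$ for every $e \in D$. Whence, because $P_{\alpha_\delta} \in \mathcal{M}$, it follows that
$d \in \operatorname{opt}^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$ for all $\delta > 0$, and therefore, $d \in \bigcap_{\delta > 0} \operatorname{opt}^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$.

Conversely, assume $d \in \bigcap_{\delta > 0} \operatorname{opt}^{\epsilon + \delta}(\Omega, D, \mathcal{M}, L)$. Then, for all $\delta > 0$, there is a $P_\delta \in \mathcal{M}$ such
that $E_{P_\delta}(e) - E_{P_\delta}(d) \le (\epsilon + \delta)R_D$ for every $e \in D$. Hence, for all $n \in \mathbb{N}$, there is a $P_n \in \mathcal{M}$ such
that for every $e \in D$

$$E_{P_n}(e) - E_{P_n}(d) \le 1/n + \epsilon R_D \tag{10}$$

Now choose any $Q$ in

$$\mathcal{R} = \bigcap_{m \in \mathbb{N}} \operatorname{cl}(\{P_n : n \ge m\})$$

Similarly as before, it can be established that $\mathcal{R}$ is non-empty and that $Q \in \operatorname{cl}(\mathcal{M})$. If we can
show that $E_Q(e) - E_Q(d) \le \epsilon R_D$ for all $e \in D$, then $d$ indeed belongs to $\operatorname{opt}^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L)$
and the desired result is established.

Indeed, because $Q \in \operatorname{cl}(\{P_n : n \ge m\})$, for every $e \in D$, there is a net $(P_{n_{\alpha,e}})_{\alpha \in \mathbb{A}}$ in $\{P_n : n \ge m\}$
(so $n_{\alpha,e} \ge m$) such that $\lim_\alpha E_{P_{n_{\alpha,e}}}(f_e - f_d) = E_Q(f_e - f_d)$. Hence, for every $e \in D$ and
every $\gamma > 0$, there is an $\alpha \in \mathbb{A}$ such that $E_Q(e) - E_Q(d) \le E_{P_{n_{\alpha,e}}}(e) - E_{P_{n_{\alpha,e}}}(d) + \gamma$, and
therefore by Eq. (10), $E_Q(e) - E_Q(d) \le 1/n_{\alpha,e} + \epsilon R_D + \gamma$. Because this inequality holds for every
$m$ and every $\gamma > 0$, and $n_{\alpha,e} \ge m$, it follows that $E_Q(e) - E_Q(d) \le \epsilon R_D$ for every $e \in D$. □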
In particular, assuming $R_D > 0$, if for any $\delta > \epsilon \ge 0$

$$\max{}^\epsilon(\Omega, D, \mathcal{M}, L) = \max{}^\delta(\Omega, D, \mathcal{M}, L)$$

then

$$\max{}^\epsilon(\Omega, D, \mathcal{M}, L) = \max{}^\epsilon(\Omega, D, \operatorname{cl}(\mathcal{M}), L)$$

A similar result holds for the $\operatorname{opt}^\epsilon$ operator for finite $D$.

As a special case, Lemma 9 implies an interesting connection between maximality and
$\epsilon$-maximality:

Corollary 10. Assume that $R_D > 0$. For any decision problem $(\Omega, D, \mathcal{M}, L)$, the following
equality holds:

$$\max(\Omega, D, \operatorname{cl}(\mathcal{M}), L) = \bigcap_{\epsilon > 0} \max{}^\epsilon(\Omega, D, \mathcal{M}, L)$$

Again, a similar result holds for optimality and $\epsilon$-optimality, in case $D$ is finite.

In the following theorem, assume that $0 < \epsilon < 1/2$.

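Theorem 11. Consider two decision problems $(\Omega, D, \mathcal{M}, L)$ and $(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$. Assume that
$R_D > 0$. If $L \simeq_\epsilon \hat L$ and $\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))) \simeq_\delta \hat{\mathcal{M}}$ then, for any $\gamma \ge 0$,

$$\max{}^\gamma(\Omega, D, \mathcal{M}, L) \subseteq \bigcap_{\eta > 0} \max{}^{\frac{\gamma + \eta}{1 - 2\epsilon} + 2\left(\frac{\epsilon}{1 - 2\epsilon} + \delta\right)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \tag{11}$$

$$\max{}^\gamma(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \subseteq \bigcap_{\eta > 0} \max{}^{\eta + \gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \mathcal{M}, L) \tag{12}$$

Proof. First, note that

$$\begin{aligned}
\max{}^\gamma(\Omega, D, \mathcal{M}, L) &= \max{}^\gamma(\Omega, D, \operatorname{co}(\mathcal{M}), L) \\
&\subseteq \max{}^\gamma(\Omega, D, \operatorname{cl}(\operatorname{co}(\mathcal{M})), L)
\end{aligned}$$

and by convexity of $\operatorname{cl}(\operatorname{co}(\mathcal{M}))$ [12, §26.23] and the Krein-Milman theorem [51 p. 74], the closed
convex hull of $\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M})))$ is $\operatorname{cl}(\operatorname{co}(\mathcal{M}))$, so

$$= \max{}^\gamma(\Omega, D, \operatorname{cl}(\operatorname{co}(\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))))), L)$$

and now by Corollary 10

$$\begin{aligned}
&= \bigcap_{\eta > 0} \max{}^{\gamma + \eta}(\Omega, D, \operatorname{co}(\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M})))), L) \\
&= \bigcap_{\eta > 0} \max{}^{\gamma + \eta}(\Omega, D, \operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))), L)
\end{aligned}$$

Now apply the same argument as in the proof of Theorem 6 to recover Eq. (11).

To establish Eq. (12), again use the same argument as in the proof of Theorem 6:

$$\begin{aligned}
\max{}^\gamma(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) &\subseteq \max{}^{\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))), L) \\
&\subseteq \max{}^{\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \operatorname{cl}(\operatorname{co}(\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))))), L)
\end{aligned}$$

and again by the Krein-Milman theorem [BJ p. 74], the closed convex hull of $\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M})))$ is
$\operatorname{cl}(\operatorname{co}(\mathcal{M}))$, so

$$\begin{aligned}
&= \max{}^{\gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \operatorname{cl}(\operatorname{co}(\mathcal{M})), L) \\
&= \bigcap_{\eta > 0} \max{}^{\eta + \gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \operatorname{co}(\mathcal{M}), L) \\
&= \bigcap_{\eta > 0} \max{}^{\eta + \gamma(1 + 2\epsilon) + 2(\epsilon + \delta(1 + 2\epsilon))}(\Omega, D, \mathcal{M}, L)
\end{aligned}$$

□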

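Again, if we ignore higher order terms in $\gamma$, $\epsilon$, and $\delta$, then the above theorem says that when
moving from the original decision problem to the approximate decision problem, with relative
error $\epsilon$ in gambles and relative error $\delta$ in probabilities, the relative error in maximality increases
by $2(\epsilon + \delta)$. Hence, for small $\epsilon$ and $\delta$ the following holds, up to a small error: if $L \simeq_\epsilon \hat L$ and
$\operatorname{ext}(\operatorname{cl}(\operatorname{co}(\mathcal{M}))) \simeq_\delta \hat{\mathcal{M}}$, then

$$\max(\Omega, D, \mathcal{M}, L) \subseteq \max{}^{2(\epsilon + \delta)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) \subseteq \max{}^{4(\epsilon + \delta)}(\Omega, D, \mathcal{M}, L)$$

Again, $\max^{2(\epsilon + \delta)}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L)$ seems a logical choice when calculating maximal decisions in
practice.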



6. Conclusion and Remarks 

With this paper, I hope to have consolidated at least part of our everyday intuition when
approximating decision problems involving sets of probabilities, for instance when those problems
have to be solved by computer.

One result is quite depressing: Lemma 2 and Lemma 3 seem to tell us that except in the
simplest cases, any approximation will need too many resources to be of any practical value, as
demonstrated by Table 1 and Table 2.

Fortunately, not all is lost. If we resort to pairwise comparison, we may restrict ourselves
to the extreme points of the closure of the convex hull of the credal set, which can be much
smaller than the original credal set. Closing the credal set only has an arbitrarily small effect on
maximality, and in part for this reason, it turns out that approximating extreme points suffices
when restricting to pairwise preference.

I wish to emphasise that the bounds on the cardinalities of the approximating partition and 
the approximating credal set are only upper bounds under very weak assumptions. These bounds 
are only attained in extreme situations. In many cases the credal set and the loss function have 
additional structure which may allow for much lower upper bounds. 

In case the problem has sufficient structure, an alternative approach is to develop algorithms 
which do not need to traverse the complete credal set (or an approximation thereof) to compute 
the optimal solution. The imprecise Dirichlet model has already been given considerable attention 
in this direction [TJ. 

Obermeier and Augustin [10] have described a method to approximate decision problems
by applying Luceño's adaptive discretisation method to either all elements of the credal set
(so the partition varies with the distribution), or on a reference distribution of that set. This
type of approximation aims to preserve the first $r$ moments of a distribution. Although precise
convergence results and bounds on the precision of this approximation have not yet been proven,
examples have shown that this method can yield good results in practice.

Finally, another approach could consist of sampling elements from the credal set, for instance
through Monte-Carlo techniques, and solving a classical decision problem for each of these elements.
If the sample $S$ from $\hat{\mathcal{M}}$ is large enough, then (since $\bigcup_{\hat P \in S} \operatorname{opt}(\mathcal{A}, D, \hat P, \hat L) = \operatorname{opt}(\mathcal{A}, D, S, \hat L)$)
hopefully

$$\operatorname{opt}(\mathcal{A}, D, \hat{\mathcal{M}}, \hat L) = \bigcup_{\hat P \in S} \operatorname{opt}(\mathcal{A}, D, \hat P, \hat L)$$



The question how large a sample we need to ensure convergence is definitely worth further 
investigation. 




Acknowledgements 

I am grateful to Teddy Seidenfeld for the many helpful discussions on issues related to this 
paper, and also for encouraging me to extend my view on approximations to choice functions. I 
thank Max Jensen for his help in characterising the discretisation of the simplex in $\mathbb{R}^n$, presented
in the appendix. I also thank all three referees for their constructive comments and useful 
suggestions which have improved the presentation of this paper. The research reported in this 
paper has been supported in part by the Belgian American Educational Foundation. 

Appendix A. Discretisation Of The Standard Simplex In $\mathbb{R}^n$

In this appendix a simple discretisation of $\Delta^n$, the standard simplex in $\mathbb{R}^n$, is studied. These
results are not new and are in fact related to well known notions from combinatorics, in particular
multisets [15]. The standard simplex $\Delta^n$ is defined as

$$\Delta^n = \{x \in \mathbb{R}^n : x \ge 0,\ |x|_1 = 1\}$$

where $|\cdot|_1$ denotes the 1-norm, i.e. $|x|_1 = \sum_{i=1}^n |x_i|$.

For any non-zero natural number $N$, let $\Delta^n_N$ denote the following finite subset of $\Delta^n$:

$$\Delta^n_N = \{m/N : m \in \mathbb{N}^n,\ |m|_1 = N\}$$

(above, $\mathbb{N}$ is the set of natural numbers including 0).

Lemma 12. The cardinality of $\Delta^n_N$ is $\binom{N + n - 1}{N}$.

Proof. There is an obvious one-to-one and onto correspondence between $\Delta^n_N$ and all multisets
of cardinality $N$ with elements taken from $\{1, \dots, n\}$: for any $m/N \in \Delta^n_N$, interpret $m_i$ as the
multiplicity of $i$. The number of all such multisets is precisely $\binom{N + n - 1}{N}$ (see Stanley [15]). □

Lemma 13. For every $x$ in $\Delta^n$ there is a $y$ in $\Delta^n_N$ such that

$$|x - y|_1 < n/N$$

Proof. For each $i \in \{1, \dots, n\}$, let $m_i$ be the unique natural number such that $x_i \in [m_i/N, (m_i + 1)/N)$,
or equivalently, let $m_i$ be the largest natural number such that $m_i/N \le x_i$. Define
$M = \sum_{i=1}^n m_i$. Then, $M \le N < M + n$ since $M/N = |m/N|_1 \le |x|_1 = 1$ and $(M + n)/N =
|(m + 1)/N|_1 > |x|_1 = 1$. Define

$$e_i = \begin{cases} 1 & \text{if } i \in \{1, \dots, N - M\} \\ 0 & \text{if } i \in \{N - M + 1, \dots, n\} \end{cases}$$

and let $y = (m + e)/N$. Note that $y \in \Delta^n_N$ because $|y|_1 = |m + e|_1/N = (M + (N - M))/N = 1$.
Finally,

$$|x - y|_1 = \sum_{i=1}^{N - M} \left| x_i - \frac{m_i + 1}{N} \right| + \sum_{i=N - M + 1}^{n} \left| x_i - \frac{m_i}{N} \right| < n/N$$

as $\left| x_i - \frac{m_i}{N} \right| < 1/N$ and $\left| x_i - \frac{m_i + 1}{N} \right| \le 1/N$. □
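The rounding construction in the proof of Lemma 13 and the count of Lemma 12 translate directly into code. The following is a minimal sketch with exact rational arithmetic; the function names `round_to_grid` and `grid_size` are hypothetical.

```python
from fractions import Fraction
from math import comb, floor


def round_to_grid(x, N):
    """Map a point x of the standard simplex to a grid point with denominators N.

    Each coordinate is rounded down to a multiple of 1/N, and the missing mass
    (N - M units of 1/N) is handed to the first N - M coordinates, as in Lemma 13.
    """
    m = [floor(xi * N) for xi in x]
    surplus = N - sum(m)
    return [Fraction(mi + (1 if i < surplus else 0), N) for i, mi in enumerate(m)]


def grid_size(n, N):
    """Number of grid points, i.e. multisets of size N over n elements (Lemma 12)."""
    return comb(N + n - 1, N)


# Hypothetical example: the barycentre of the simplex in R^3, rounded with N = 4.
x = [Fraction(1, 3)] * 3
N = 4
y = round_to_grid(x, N)
distance = sum(abs(xi - yi) for xi, yi in zip(x, y))
print(y, distance, grid_size(len(x), N))
```

In the example the 1-norm distance is $1/3$, comfortably below the guaranteed bound $n/N = 3/4$.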